Name

RosettaMan, rman - reverse compile man pages from formatted form to a number of source formats

Synopsis

rman [options] [file]

Description

RosettaMan takes formatted man pages from most of the popular flavors of UNIX and transforms them into any of a number of text source formats. RosettaMan accepts formatted man pages from: Hewlett-Packard HP-UX, AT&T System V, SunOS, Sun Solaris, OSF/1, DEC Ultrix, SGI IRIX, Linux. It can produce ASCII-only, section headers-only, TkMan, [tn]roff, Ensemble, SGML, HTML, LaTeX, RTF, Perl 5 POD. A modular architecture makes it easy to add further output formats.

Options

-f <ASCII|roff|TkMan|Ensemble|Sections|HTML|SGML|LaTeX|RTF|POD>
Set the output filter. Defaults to ASCII.
-b
Try to recognize subsection titles in addition to section titles. This can cause problems on some UNIX systems.
-t #
For those macro sets that use tabs in place of spaces where possible in order to reduce the number of characters used, set tabstops every # columns. Defaults to 8.
-c
Move changebars, such as those found in the Tcl/Tk manual pages, to the left.
-m
Disable aggressive man page parsing. Aggressive man page parsing, which is on by default, elides headers and footers, identifies sections, and more.
-v
Show version number and exit.
-n name
Set name of man page (used in roff format). If the filename is given in the form "name.section", the name and section are automatically determined.
-s #
Set volume (aka section) number of man page (used in roff format).
-p
Toggle paragraph mode. The filter determines whether lines should be broken as they were by nroff, or whether lines should be flowed together into paragraphs. Mainly for internal use.
-r printf-string
In HTML mode this sets the URL form by which to retrieve other man pages. The string can use two supplied parameters: the man page name and its section. (See the Examples section.)
-l printf-string
In HTML mode this sets the <TITLE> of the man pages, given the same parameters as -r.
-K
Indicate that manual pages don't have page breaks, so don't look for footers and headers around them. (Older nroff -man macros always put in page breaks, but lately some vendors have realized that printouts are made through troff, whereas nroff -man is used to format pages for reading on screen, and so have eliminated page breaks.) RosettaMan usually gets this right even without this flag.
-T
Turn on aggressive table parsing.
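
As an illustration of the "name.section" filename convention described under -n, the following shell sketch splits a filename at its last dot. The function name split_page is hypothetical, and splitting at the last dot is an assumption made for the example; it is not taken from RosettaMan's source.

```shell
# Hypothetical helper: split "name.section" at the last dot.
# This only illustrates the convention; it is not RosettaMan's code.
split_page() {
    file=$(basename "$1")          # drop any leading directories
    case $file in
        *.*) name=${file%.*}       # everything before the last dot
             section=${file##*.}   # everything after the last dot
             ;;
        *)   name=$file            # no dot: no section can be inferred
             section=
             ;;
    esac
    printf '%s %s\n' "$name" "$section"
}

split_page /usr/local/man/cat1/ls.1    # prints "ls 1"
```

When a page arrives on standard input there is no filename to split, which is why the pipeline examples pass -n and -s explicitly.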

Notes on Filter Types

ROFF

Some flavors of UNIX ship without [tn]roff source, making one's laser printer, for man pages, little more than a laser line printer. This filter tries to intuit the original [tn]roff directives, which can then be recompiled by [tn]roff.

TkMan

TkMan, a hypertext man page browser, uses RosettaMan to show man pages without the (usually) useless headers and footers on each page. It also collects section and (optionally) subsection heads for direct access from a pulldown menu. TkMan and Tcl/Tk, the toolkit in which it's written, are available via anonymous ftp from ftp.cs.berkeley.edu in the directories /ucb/people/phelps/tcltk and /ucb/tcl.

Ensemble

Ensemble, a multimedia editor of structured documents, is currently being developed by the research groups of Professors Michael A. Harrison and Susan L. Graham at the University of California, Berkeley. With proper structure and presentation specifications (schemas), the appearance of a manual page can be radically transformed by Ensemble.

ASCII

When printed on a line printer, man pages try to produce special text effects by overstriking characters with themselves (to produce bold) and underscores (underlining). Other text processing software, such as text editors, searchers, and indexers, must counteract this. The ASCII filter strips away this formatting. Piping nroff output through col -b also strips away this formatting, but it leaves behind unsightly page headers and footers.
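
The overstriking described above can be seen, and undone, with a few lines of shell. This is a minimal sketch of the idea only: the function name strip_overstrike is hypothetical, it is not how RosettaMan implements its ASCII filter, and like col -b it does nothing about page headers and footers.

```shell
# Hypothetical helper; illustrates overstrike removal, not RosettaMan's code.
strip_overstrike() {
    bs=$(printf '\b')     # a literal backspace character
    sed "s/.$bs//g"       # delete every character that is followed by a backspace
}

# "NAME" in bold (each letter struck twice) and "ls" underlined:
printf 'N\bNA\bAM\bME\bE _\bl_\bs\n' | strip_overstrike    # prints "NAME ls"
```

Each overstruck sequence is a character, a backspace, and the character that replaces it, so deleting the first two leaves only the visible text.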

Sections

Dumps section and (optionally) subsection titles.

HTML

With a simple extension to an HTTP server for Mosaic or another World Wide Web browser, RosettaMan can produce high quality HTML on the fly. Several such extensions and pointers to several others are included in RosettaMan's contrib directory.

SGML

I just discovered the Davenport DTD, and support for it will be coming Real Soon Now.

LaTeX

Why not?

RTF

Use the output on a Mac, a NeXT, or elsewhere. It may be possible to integrate arbitrary man pages more closely with NeXT's documentation system; NeXT may also have its own man page macros that do this.

To produce PostScript, use groff or psroff. To produce FrameMaker MIF, use FrameMaker's builtin filter. In both cases you need [tn]roff source, so if you only have a formatted version of the manual page, use RosettaMan's roff filter first.

Examples

To convert the formatted man page named ls.1 back into [tn]roff source form:

rman -f roff /usr/local/man/cat1/ls.1 > /usr/local/man/man1/ls.1

Long man pages are often compressed to conserve space (compression is especially effective on formatted man pages, as many of the characters are spaces). A long man page probably has subsections, which we try to separate out with -b (some macro sets don't distinguish subsections well enough for RosettaMan to detect them). Let's convert one such page, automount, to LaTeX format:
pcat /usr/catman/a_man/cat1/automount.z | rman -b -n automount -s 1 -f latex > automount.man

Alternatively:
man 1 automount | rman -b -n automount -s 1 -f latex > automount.man

For HTML/Mosaic users, RosettaMan can, without modification of the source code, produce HTML links that point to other HTML man pages either pregenerated or generated on the fly. First let's assume pregenerated HTML versions of man pages stored in /usr/man/html. Generate these one-by-one with the following form:
rman -f html -r 'http:/usr/man/html/%s.%s.html' /usr/man/cat1/ls.1 > /usr/man/html/ls.1.html

If you've extended your HTML client to generate HTML on the fly you should use something like:
rman -f html -r 'http:~/bin/man2html?%s:%s' /usr/man/cat1/ls.1
when generating HTML.
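
To see what links the two -r strings above would yield, the shell's own printf can stand in for the substitution. This is an illustration only: it assumes, following the description of -r, that the first %s receives the man page name and the second its section, and the helper name expand_link is hypothetical.

```shell
# Illustration only: the shell's printf stands in for RosettaMan's substitution.
# $1 = printf-string as given to -r, $2 = man page name, $3 = section
expand_link() {
    printf "$1\n" "$2" "$3"
}

expand_link 'http:/usr/man/html/%s.%s.html' ls 1    # prints "http:/usr/man/html/ls.1.html"
expand_link 'http:~/bin/man2html?%s:%s' ls 1        # prints "http:~/bin/man2html?ls:1"
```

The first result lines up with the file written by the pregenerated example, /usr/man/html/ls.1.html, which is what lets the generated pages link to one another.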

Bugs/Incompatibilities

RosettaMan is not perfect in all cases, but it usually does a good job, and in any case reduces the problem of converting man pages to light editing.

Tables, especially H-P's, aren't handled very well; fortunately, tables seem to be rare in man pages.

The man pager woman applies its own idea of formatting for man pages, which can confuse RosettaMan. Bypass woman by passing the formatted manual page text directly into RosettaMan.

The [tn]roff output format uses \fB to turn on boldface. If your macro set requires .B, you'll have to postprocess the RosettaMan output.
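
One possible shape for such a postprocessing step is sketched below. It rests on assumptions that should be checked against real output: that boldface is switched off with \fR, and that only lines consisting entirely of one bold run need converting. The function name bold_to_macro is hypothetical, and lines with inline bold, other escapes, or quotes are left for hand editing.

```shell
# Hypothetical postprocessor: rewrite lines that are one whole bold run
# ("\fBtext\fR") as a ".B text" macro call. Assumes bold ends with \fR.
# Lines containing any other backslash escape are passed through unchanged.
bold_to_macro() {
    sed 's/^\\fB\([^\\]*\)\\fR$/.B \1/'
}

printf '%s\n' '\fBrman\fR' 'plain \fBinline\fR text' | bold_to_macro
# prints:
# .B rman
# plain \fBinline\fR text
```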

See Also

tkman(1), xman(1), man(1), man(7)

Copyright

RosettaMan
Copyright (c) 1993-1994 T.A. Phelps (phelps@CS.Berkeley.EDU)
All Rights Reserved.

Permission to use, copy, modify, and distribute this software and its documentation for educational, research and non-profit purposes, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and the following paragraph appears in all copies.

Permission to incorporate this software into commercial products may be obtained from the Office of Technology Licensing, 2150 Shattuck Avenue, Suite 510, Berkeley, CA 94704.

Manual page last updated on $Date: 1994/12/07 01:09:08 $